
Frequently asked questions

Gilles Quénot edited this page Apr 30, 2023 · 7 revisions

Getting the last node

Q: How do I get the last node? //foo//bar returns all bars, but I only want the last one, and //foo//bar[last()] did not work.

    <bar>First </bar>
    <bar>Second </bar>
    <bar>Third </bar>
    <bar>Fourth </bar>

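A: //foo//bar[last()] returns every bar that is the last bar child of its own parent. In the example the bars sit under two different parents, so it returns Second and Fourth.

You need (//foo//bar)[last()], which collects all bars into one sequence first and then takes the last item of that sequence.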

Getting nodes with an attribute

Q: I want to extract the title attribute from links whose href contains the string "contentFile.aspx".

This command returns the href, but I do not know how to get the Title contents instead.

xidel --xquery '//a/@href[contains(., "contentFile.aspx")]'

A: You can go back from the @href to the corresponding a:

xidel --xquery '//a/@href[contains(., "contentFile.aspx")]/../@title'

Or you can put the condition on the a element itself. The following two forms are equivalent:

xidel --xquery '//a[@href[contains(., "contentFile.aspx")]]/@title'


xidel --xquery '//a[contains(@href, "contentFile.aspx")]/@title'

Getting nodes containing text

Q: How do you find tags which include a certain text?

A: You can use contains or matches on these nodes. E.g.

xidel input.html -e '//*[contains(., "searched text")]'

This finds every element that contains the text, including all of its ancestors, because an element's string value includes the text of all its descendants.

To find the nodes without ancestors, you can check only the direct text of the nodes:

xidel input.html -e '//*[text()[contains(., "searched text")]]'

This is also much faster. However, text that spans multiple nodes is not found: in <span>foo<b>bar</b></span>, text() can match foo or bar, but not foobar.

When "searched text" is a regular expression, you can use matches in place of contains.

Replacing empty/null nodes

Q: How do I return a default value if the input is empty?

A: For inputs that have at most one value use:

(input, "default value")[1]

[1] returns the first value of a sequence, so it will return input if input exists. If input is empty, the sequence becomes ("default value")[1], so it will return "default value".

Deletion of nodes

Q: How do I delete the div from

    <span>I want to keep this</span>
    <div class="I_want_to_delete_this">
        <span>blah blah</span>
    </div>
    <span>I want to keep this too</span>

to get something like

    <span>I want to keep this</span>
    <span>I want to keep this too</span>


A: All data is immutable, so you cannot delete something from a document, but you can create a new document without these nodes.

For example using the x:replace-nodes function:

xidel --xml -e 'x:replace-nodes(//div[@class="I_want_to_delete_this"],())' xx.xml 

Or using the x:transform-nodes function:

xidel -s input.xml -e '
  x:transform-nodes(
    function($x){
      if (name($x)="div" and $x[@class="I_want_to_delete_this"])
      then ()
      else $x
    }
  )
' --output-node-format=xml --output-node-indent


xidel -s input.xml -e '
  let $delete:=//div[@class="I_want_to_delete_this"] return
  x:transform-nodes(
    function($x){if ($delete[$x is .]) then () else $x}
  )
' --output-node-format=xml --output-node-indent


xidel -s input.xml -e '
  let $delete:=//div[@class="I_want_to_delete_this"] return
  x:transform-nodes(
    function($x){$x[not($delete[$x is .])]}
  )
' --output-node-format=xml --output-node-indent

Using Xidel in a shell pipeline | xidel

Q: Is there any way of processing output from another script in xidel, i.e. is there any option to tell xidel to grab the content like this: grep foobar test.html | xidel ...

A: If you pass a dash (-) as the file name, xidel reads its input from the pipe (standard input).

 grep foobar test.html | xidel - ...


Also look here for things to avoid: